48 research outputs found

    Preliminary Work on Speech Unit Selection Using Syntax Phonology Interface

    This paper proposes an approach which uses a syntax-phonology interface to select the most appropriate speech units for a target sentence. The selection of the speech units is done by constructing the syntax-phonology tree structure of the target sentence. The construction of the syntax-phonology tree is adapted from the example-based parsing of UTMK machine translation

    Building An Ontology-Based Multilingual Lexicon For Word Sense Disambiguation In Machine Translation.

    Word sense disambiguation (WSD) requires the establishment of a list of the different meanings of words. WSD efforts in machine translation require, in addition, the equivalent translation words in target languages

    Lost in Translation: Word Sense Disambiguation.

    In natural languages, a word can take on different meanings in different contexts. Word sense disambiguation (WSD) refers to the task of determining the correct meaning or sense of a word in context

    Learning-to-Translate Based on the S-SSTC Annotation Schema

    A Synchronization Structure Of SSTC And Its Applications In Machine Translation.

    In this paper, a flexible annotation schema called Structured String-Tree Correspondence (SSTC) is introduced. To describe the correspondence between different languages, we propose a variant of SSTC called synchronous SSTC (S-SSTC). We also describe how S-SSTC provides the flexibility to treat some of the non-standard cases that are problematic for other synchronous formalisms

    Digitising Dictionaries For Advanced Look-Up And Lexical Knowledge Research In Malay.

    Electronic dictionaries need not be mere OCR-digitised versions of their paper-form counterparts: they can be made more computer-tractable to facilitate more meaningful operations and data exchange. For instance, explicitly annotating different fields in a dictionary entry allows more targeted look-ups, as we will show using Kamus Dewan as an example. Dictionary data can also be reorganised to enable semantic-based search. The WordNet lexical database is one such model, for which we created a prototype for the Malay language. As both the proposed annotated Kamus Dewan and Malay WordNet are compiled according to established standards and guidelines, the data can be aligned with similar lexical resources of other languages. This provides a means for mutual sharing, interchange and enrichment of lexical data and knowledge between Malay and other languages

    Porting SIMR/GSA Algorithms For Mapping And Alignment To Malay-English Bitexts.

    Parallel texts, or bitexts, where the same content is available in several languages as a result of document translation, are becoming plentiful and available, both in private data warehouses and on publicly accessible sites on the WWW

    Building A Semantic-Primitive-Based Lexical Consultation System.

    The paper describes the design of a semantic-primitive-based lexical consultation system and the possible processes to be performed on a machine-readable dictionary (MRD) and a corpus to produce a machine-tractable dictionary